ABSTRACT

We define a new interactive differentially private mechanism 
— the median mechanism — for answering arbitrary predi- 
cate queries that arrive online. Given fixed accuracy and pri- 
vacy constraints, this mechanism can answer exponentially 
more queries than the previously best known interactive pri- 
vacy mechanism (the Laplace mechanism, which indepen- 
dently perturbs each query result). With respect to the 
number of queries, our guarantee is close to the best possible, 
even for non-interactive privacy mechanisms. Conceptually, 
the median mechanism is the first privacy mechanism capa- 
ble of identifying and exploiting correlations among queries 
in an interactive setting. 

We also give an efficient implementation of the median 
mechanism, with running time polynomial in the number 
of queries, the database size, and the domain size. This 
efficient implementation guarantees privacy for all input 
databases, and accurate query results for almost all input 
distributions. The dependence of the privacy on the number 
of queries in this mechanism improves over that of the best 
previously known efficient mechanism by a super-polynomial 
factor, even in the non-interactive setting. 
1. INTRODUCTION

Managing a data set with sensitive but useful information, 
such as medical records, requires reconciling two objectives: 
providing utility to others, perhaps in the form of aggregate 
statistics; and respecting the privacy of individuals who con- 
tribute to the data set. The field of private data analysis, 
and in particular work on differential privacy, provides a 
mathematical foundation for reasoning about this utility- 
privacy trade-off and offers methods for non-trivial data 
analysis that are provably privacy-preserving in a precise 
sense. For a recent survey of the field, see Dwork [Dwo08].

More precisely, consider a domain $X$ and database size $n$.
A mechanism is a randomized function from the set $X^n$ of
databases to some range. For a parameter $\alpha > 0$, a mechanism
$M$ is $\alpha$-differentially private if, for every database $D$
and fixed subset $S$ of the range of $M$, changing a single
component of $D$ changes the probability that $M$ outputs
something in $S$ by at most an $e^{\alpha}$ factor. The output of a
differentially private mechanism (and any analysis or privacy
attack that follows) is thus essentially independent of
whether or not a given individual "opts in" or "opts out" of
the database.

Achieving differential privacy requires "sufficiently noisy"
answers [DN03]. For example, suppose we're interested in
the result of a query $f$ (a function from databases to some
range) that simply counts the fraction of database elements
that satisfy some predicate $\varphi$ on $X$. A special case of
a result in Dwork et al. [DMNS06] asserts that the following
mechanism is $\alpha$-differentially private: if the underlying
database is $D$, output $f(D) + \Delta$, where the output
perturbation $\Delta$ is drawn from the Laplace distribution
$\mathrm{Lap}(\frac{1}{n\alpha})$ with density
$p(y) = \frac{n\alpha}{2}\exp(-n\alpha|y|)$. Among all
$\alpha$-differentially private mechanisms, this one (or rather,
a discretized analog of it) maximizes user utility in a strong
sense [GRS09].
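
As an illustration, the single-query Laplace mechanism just described can be sketched in a few lines of Python. This is a minimal sketch, not the discretized mechanism of [GRS09]; the helper names `sample_laplace` and `laplace_mechanism` and the list representation of the database are assumptions made here for illustration.

```python
import math
import random

def sample_laplace(scale, rng):
    """Draw from Lap(scale) by inverting the Laplace CDF."""
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    # max(...) guards against log(0) in the measure-zero case u == -0.5
    magnitude = -math.log(max(1.0 - 2.0 * abs(u), 1e-300))
    return scale * math.copysign(1.0, u) * magnitude

def laplace_mechanism(database, predicate, alpha, rng):
    """Answer one predicate query with Lap(1/(n*alpha)) output perturbation."""
    n = len(database)
    true_answer = sum(1 for x in database if predicate(x)) / n
    return true_answer + sample_laplace(1.0 / (n * alpha), rng)
```

With a database of 1000 elements and privacy parameter 1, the noise has scale 0.001, so the answer is typically within a few thousandths of the true fraction.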

What if we care about more than a single one-dimensional
statistic? Suppose we're interested in $k$ predicate queries
$f_1, \ldots, f_k$, where $k$ could be large, even super-polynomial
in $n$. A natural solution is to use an independent Laplace
perturbation for each query answer [DMNS06]. To maintain
$\alpha$-differential privacy, the magnitude of noise has to scale
linearly with $k$, with each perturbation drawn from
$\mathrm{Lap}(\frac{k}{n\alpha})$.
Put another way, suppose one fixes "usefulness parameters"
$\epsilon, \delta$ and insists that the mechanism is
$(\epsilon, \delta)$-useful, meaning that the outputs are within
$\epsilon$ of the correct query answers with probability at least
$1 - \delta$. This constrains the magnitude of the Laplace noise,
and the privacy parameter $\alpha$ now suffers linearly with the
number $k$ of answered queries. This dependence limits the use of
this mechanism to a sublinear $k = o(n)$ number of queries.
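
To make the linear dependence concrete, the following sketch estimates how many queries independent Laplace perturbations can answer for fixed parameters. It assumes a simple union bound over the $k$ queries together with the standard tail bound $\Pr[|\mathrm{Lap}(b)| > \epsilon] = e^{-\epsilon/b}$; the function names are hypothetical and the calculation is illustrative rather than the paper's own analysis.

```python
import math

def failure_bound(k, n, alpha, eps):
    """Union bound on Pr[some answer errs by more than eps] with Lap(k/(n*alpha)) noise."""
    scale = k / (n * alpha)
    return min(1.0, k * math.exp(-eps / scale))

def max_queries(n, alpha, eps, delta):
    """Largest k whose union bound stays at or below delta (linear scan)."""
    k = 0
    while failure_bound(k + 1, n, alpha, eps) <= delta:
        k += 1
    return k
```

Under this bound, doubling the database size roughly doubles the number of queries that can be answered, which is the sublinear-in-$n$ behavior noted above.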

Can we do better than independent output perturbations?
For special classes of queries like predicate queries, Blum,
Ligett, and Roth [BLR08] give an affirmative answer (building
on techniques of Kasiviswanathan et al. [KLN+08]).
Specifically, in [BLR08] the exponential mechanism of McSherry
and Talwar [MT07] is used to show that, for fixed
usefulness parameters $\epsilon, \delta$, the privacy parameter
$\alpha$ only has to scale logarithmically with the number of
queries.¹ This permits simultaneous non-trivial utility and
privacy guarantees even for an exponential number of queries.
Moreover, this dependence on $\log k$ is necessary in every
differentially private mechanism (see the full version of [BLR08]).

The mechanism in [BLR08] suffers from two drawbacks,
however. First, it is non-interactive: it requires all queries
$f_1, \ldots, f_k$ to be given upfront, and computes (noisy) outputs
of all of them at once.² By contrast, independent Laplace
output perturbations can obviously be implemented inter- 
actively, with the queries arriving online and each answered 
immediately. There is good intuition for why the non- 
interactive setting helps: outperforming independent out- 
put perturbations requires correlating perturbations across 
multiple queries, and this is clearly easier when the queries 
are known in advance. Indeed, prior to the present work, 
no interactive mechanism better than independent Laplace 
perturbations was known. 

Second, the mechanism in [BLR08] is inefficient. Here
by "efficient" we mean has running time polynomial in $n$,
$k$, and $|X|$; Dwork et al. [DNR+09] prove that this is
essentially the best one could hope for (under certain
cryptographic assumptions). The mechanism in [BLR08] is not
efficient because it requires sampling from a non-trivial
probability distribution over an unstructured space of
exponential size. Dwork et al. [DNR+09] recently gave an efficient
(non-interactive) mechanism that is better than independent
Laplace perturbations, in that the privacy parameter $\alpha$ of
the mechanism scales as $2^{\sqrt{\log k}}$ with the number of
queries $k$ (for fixed usefulness parameters $\epsilon, \delta$).

Very recently, Hardt and Talwar [HT10] gave upper and
lower bounds for answering noninteractive linear queries
which are tight in a related setting. These bounds are not
tight in our setting, however, unless the number of queries
is small with respect to the size of the database. When the
number of queries is large, our mechanism actually yields error
significantly less than required in general by their lower
bound.³ This is not a contradiction, because when translated
into the setting of [HT10], our database size $n$ becomes
a sparsity parameter that is not considered in their bounds.



1 More generally, linearly with the VC dimension of the set
of queries, which is always at most $\log_2 k$.

2 Or rather, it computes a compact representation of these
outputs in the form of a synthetic database.

3 We give a mechanism for answering $k$ "counting queries"
with coordinate-wise error $O(n^{2/3} \log k (\log|X|/\alpha)^{1/3})$.
This is less error than required by their lower bound of roughly
$\Omega(\sqrt{k \log(|X|/k)}/\alpha)$ unless
$k \le \tilde{O}((n\alpha/\log|X|)^{4/3})$. We can take $k$ to be as
large as $k = \tilde{\Omega}(2^{(n\alpha/\log|X|)^{1/3}})$, in which
case our upper bound is a significant improvement, as are
the upper bounds of [BLR08] and [DNR+09].



1.1 Our Results 

We define a new interactive differentially private mechanism
for answering $k$ arbitrary predicate queries, called the
median mechanism.⁴ The basic implementation of the median
mechanism interactively answers queries $f_1, \ldots, f_k$ that
arrive online, is $(\epsilon, \delta)$-useful, and has privacy
$\alpha$ that scales with $\log k \log|X|$; see Theorem 4.1 for the
exact statement. These privacy and utility guarantees hold even if
an adversary can adaptively choose each $f_i$ after seeing the
mechanism's first $i - 1$ answers. This is the first interactive
mechanism better than the Laplace mechanism, and its performance
is close to the best possible even in the non-interactive
setting.

The basic implementation of the median mechanism is
not efficient, and we give an efficient implementation with a
somewhat weaker utility guarantee. (The privacy guarantee
is as strong as in the basic implementation.) This alternative
implementation runs in time polynomial in $n$, $k$, and $|X|$,
and satisfies the following (Theorem 5.1): for every sequence
$f_1, \ldots, f_k$ of predicate queries, for all but a negligible
fraction of input distributions, the efficient median mechanism
is $(\epsilon, \delta)$-useful.

This is the first efficient mechanism with a non-trivial util- 
ity guarantee and polylogarithmic privacy cost, even in the 
non-interactive setting. 

1.2 The Main Ideas 

The key challenge to designing an interactive mechanism 
that outperforms the Laplace mechanism lies in determining 
the appropriate correlations between different output per- 
turbations on the fly, without knowledge of future queries. 
It is not obvious that anything significantly better than inde- 
pendent perturbations is possible in the interactive setting. 

Our median mechanism and our analysis of it can be summarized,
at a high level, by three facts. First, among any set
of $k$ queries, we prove that there are $O(\log k \log|X|)$ "hard"
queries, the answers to which completely determine the answers
to all of the other queries (up to $\pm\epsilon$). Roughly, this
holds because: (i) by a VC dimension argument, we can focus
on databases over $X$ of size only $O(\log k)$; and (ii) every
time we answer a "hard" query, the number of databases
consistent with the mechanism's answers shrinks by a constant
factor, and this number cannot drop below 1 (because
of the true input database). Second, we design a method to
privately release an indicator vector which distinguishes
between hard and easy queries online. We note that a similar
private "indicator vector" technique was used by Dwork et
al. [DNR+09]. Essentially, the median mechanism deems a
query "easy" if a majority of the databases that are consistent
(up to $\pm\epsilon$) with the previous answers of the mechanism
would answer the current query accurately. The median
mechanism answers the small number of hard queries using
independent Laplace perturbations. It answers an easy
query (accurately) using the median query result given by
databases that are consistent with previous answers. A key
intuition is that if a user knows that query $i$ is easy, then it
can generate the mechanism's answer on its own. Thus answering
an easy query communicates only a single new bit of
information: that the query is easy. Finally, we show how to
release the classification of queries as "easy" and "hard" with
low privacy cost; intuitively, this is possible because
(independent of the database) there can be only
$O(\log k \log|X|)$ hard queries.

4 The privacy guarantee is $(\alpha, \tau)$-differential privacy for a
negligible function $\tau$; see Section 2 for definitions.

Our basic implementation of the median mechanism is
not efficient for the same reasons as for the mechanism
in [BLR08]: it requires non-trivial sampling from a set of
super-polynomial size. For our efficient implementation, we
pass to fractional databases, represented as fractional
histograms with components indexed by $X$. Here, we use
the random walk technology of Dyer, Frieze, and Kannan
[DFK91] for convex bodies to perform efficient random
sampling. To explain why our utility guarantee no longer
holds for every input database, recall the first fact used in the
basic implementation: every answer to a hard query shrinks
the number of consistent databases by a constant factor, and
this number starts at $|X|^{O(\log k)}$ and cannot drop below 1.
With fractional databases (where polytope volumes play the
role of set sizes), the lower bound of 1 on the set of consistent
(fractional) databases no longer holds. Nonetheless, we
prove a lower bound on the volume of this set for almost
all fractional histograms (equivalently, probability
distributions), which salvages the $O(\log k \log|X|)$ bound on hard
queries for databases drawn from such distributions.

2. PRELIMINARIES 

We briefly formalize the setting of the previous section and
record some important definitions. We consider some finite
domain $X$, and define a database $D$ to be an unordered set
of elements from $X$ (with multiplicities allowed). We write
$n = |D|$ to denote the size of the database. We consider the
set of Boolean functions (predicates) $f : X \to \{0, 1\}$. We
abuse notation and define a predicate query
$f(D) : X^* \to [0, 1]$ as $|\{x \in D : f(x) = 1\}|/|D|$, the
function that computes the fraction of elements of $D$ that
satisfy predicate $f$. We say that an answer $a_i$ to a query
$f_i$ is $\epsilon$-accurate with respect to database $D$ if
$|f_i(D) - a_i| \le \epsilon$. A mechanism
$M(D, (f_1, \ldots, f_k))$ is a function from databases and queries
to distributions over outputs. In this paper, we consider
mechanisms that answer predicate queries numerically, and
so the range of our mechanisms is $\mathbb{R}^k$.⁵

Definition 1. A mechanism $M$ is $(\epsilon, \delta)$-useful if for
every sequence of queries $(f_1, \ldots, f_k)$ and every database
$D$, with probability at least $1 - \delta$ it provides answers
$a_1, \ldots, a_k$ that are $\epsilon$-accurate for
$f_1, \ldots, f_k$ and $D$.

Recall that differential privacy means that changing the
identity of a single element of the input database does not
affect the probability of any outcome by more than a small
factor. Formally, given a database $D$, we say that a database
$D'$ of the same size is a neighbor of $D$ if it differs in only a
single element: $|D \cap D'| = |D| - 1$.

Definition 2. A mechanism $M$ satisfies $(\alpha, \tau)$-differential
privacy if for every subset $S \subseteq \mathbb{R}^k$, every set of
queries $(f_1, \ldots, f_k)$, and every pair of neighboring
databases $D, D'$:

$$\Pr[M(D) \in S] \le e^{\alpha} \cdot \Pr[M(D') \in S] + \tau.$$

We are generally interested in the case where $\tau$ is a negligible
function of some of the problem parameters, meaning one
that goes to zero faster than $x^{-c}$ for every constant $c$.

5 From $\epsilon$-accurate answers, one can efficiently reconstruct a
synthetic database that is consistent (up to $\pm\epsilon$) with those
answers, if desired [DNR+09].



Finally, the sensitivity of a real-valued query is the largest
difference between its values on neighboring databases. For
example, the sensitivity of every non-trivial predicate query
is precisely $1/n$.
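
The definitions of this section can be made concrete with a short sketch. Here a database is represented as a Python list (a multiset over the domain), and the helper names `predicate_query`, `neighbors`, and `sensitivity_at` are assumptions introduced for illustration only.

```python
def predicate_query(predicate, database):
    """f(D): the fraction of elements of D that satisfy the predicate."""
    return sum(1 for x in database if predicate(x)) / len(database)

def neighbors(database, domain):
    """Yield every database that differs from `database` in exactly one element."""
    for i, old in enumerate(database):
        for new in domain:
            if new != old:
                yield database[:i] + [new] + database[i + 1:]

def sensitivity_at(predicate, database, domain):
    """Largest change of the query value over the neighbors of one database."""
    base = predicate_query(predicate, database)
    return max(abs(base - predicate_query(predicate, nb))
               for nb in neighbors(database, domain))
```

For a non-trivial predicate, the value returned by `sensitivity_at` equals $1/n$ whenever some single replacement changes the number of satisfying elements.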

3. THE MEDIAN MECHANISM: BASIC 
IMPLEMENTATION 

We now describe the median mechanism and our basic 
implementation of it. As described in the Introduction, the 
mechanism is conceptually simple. It classifies queries as 
"easy" or "hard", essentially according to whether or not a 
majority of the databases consistent with previous answers 
to hard queries would give an accurate answer to it (in which 
case the user already "knows the answer"). Easy queries 
are answered using the corresponding median value; hard 
queries are answered as in the Laplace mechanism. 

To explain the mechanism precisely, we need to discuss
a number of parameters. We take the privacy parameter
$\alpha$, the accuracy parameter $\epsilon$, and the number $k$ of
queries as input; these are hard constraints on the performance of
our mechanism.⁶ Our mechanism obeys these constraints
with a value of $\delta$ that is inverse polynomial in $k$ and $n$,
and a value of $\tau$ that is negligible in $k$ and $n$, provided
$n$ is sufficiently large (at least polylogarithmic in $k$ and
$|X|$, see Theorem 4.1). Of course, such a result can be
rephrased as a nearly exponential lower bound on the number
of queries $k$ that can be successfully answered as a function
of the database size.⁷

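The median mechanism is shown in Figure 1 and it makes
use of several additional parameters. For our analysis, we
set their values to:

$$m = \frac{160000 \ln k \ln\frac{1}{\epsilon}}{\epsilon^2} \tag{1}$$

$$\alpha' = \frac{\alpha}{720\, m \ln|X|} = \Theta\!\left(\frac{\alpha\epsilon^2}{\log|X| \log k \log\frac{1}{\epsilon}}\right) \tag{2}$$

$$\gamma = \frac{4}{\alpha'\epsilon n} \ln\frac{2k}{\alpha} = \Theta\!\left(\frac{\log|X| \log^2 k \log\frac{1}{\epsilon}}{\alpha\epsilon^3 n}\right) \tag{3}$$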

The denominator in (2) can be thought of as our "privacy
cost" as a function of the number of queries $k$. Needless to
say, we made no effort to optimize the constants.
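
For readers who want to see the magnitudes involved, the sketch below evaluates the three parameters numerically. It assumes the readings $m = 160000 \ln k \ln(1/\epsilon)/\epsilon^2$, $\alpha' = \alpha/(720\, m \ln|X|)$, and $\gamma = \frac{4}{\alpha'\epsilon n}\ln\frac{2k}{\alpha}$ of equations (1) to (3); the function name is hypothetical.

```python
import math

def median_mechanism_parameters(k, n, domain_size, alpha, eps):
    """Return (m, alpha_prime, gamma) under the assumed readings of (1)-(3)."""
    m = 160000 * math.log(k) * math.log(1 / eps) / eps ** 2
    alpha_prime = alpha / (720 * m * math.log(domain_size))
    gamma = 4 / (alpha_prime * eps * n) * math.log(2 * k / alpha)
    return m, alpha_prime, gamma
```

Because the constants are not optimized, $m$ is already in the hundreds of millions for $k = 1000$ and $\epsilon = 0.1$, so these values are of theoretical rather than practical interest.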

The value $r_i$ in Step 2(a) of the median mechanism is
defined as

$$r_i = \frac{\sum_{S \in C_{i-1}} \exp(-\epsilon^{-1} |f_i(D) - f_i(S)|)}{|C_{i-1}|}. \tag{4}$$

For the Laplace perturbations in Steps 2(a) and 2(d), recall
that the distribution $\mathrm{Lap}(\sigma)$ has the cumulative
distribution function (for $x \ge 0$)

$$F(x) = 1 - F(-x) = 1 - \frac{1}{2} e^{-x/\sigma}. \tag{5}$$

6 We typically think of $\alpha, \epsilon$ as small constants, though our
results remain meaningful for some sub-constant values of $\alpha$
and $\epsilon$ as well. We always assume that $\alpha$ is at least inverse
polynomial in $k$. Note that when $\alpha$ or $\epsilon$ is sufficiently
small (at most $c/n$ for a small constant $c$, say), simultaneously
meaningful privacy and utility is clearly impossible.

7 In contrast, the number of queries that the Laplace mechanism
can privately and usefully answer is at most linear.



1. Initialize $C_0 = \{$ databases of size $m$ over $X$ $\}$.

2. For each query $f_1, f_2, \ldots, f_k$ in turn:

(a) Define $r_i$ as in (4) and let $\hat{r}_i = r_i + \mathrm{Lap}(\frac{2}{\epsilon n \alpha'})$.

(b) Let $t_i = \frac{3}{4} + j \cdot \gamma$, where $j \in \{0, 1, \ldots, \frac{3}{20\gamma}\}$ is
chosen with probability proportional to $2^{-j}$.

(c) If $\hat{r}_i \ge t_i$, set $a_i$ to be the median value of $f_i$ on
$C_{i-1}$.

(d) If $\hat{r}_i < t_i$, set $a_i$ to be $f_i(D) + \mathrm{Lap}(\frac{1}{n\alpha'})$.

(e) If $\hat{r}_i < t_i$, set $C_i$ to the databases $S$ of $C_{i-1}$ with
$|f_i(S) - a_i| \le \epsilon/50$; otherwise $C_i = C_{i-1}$.

(f) If $\hat{r}_j < t_j$ for more than $20 m \log|X|$ values of $j \le i$,
then halt and report failure.

Figure 1: The Median Mechanism.
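
The steps of Figure 1 can be sketched directly in Python for toy instances. This is a brute-force illustration under stated assumptions, not an efficient or tuned implementation: it enumerates all size-$m$ multisets over the domain, takes $m$, $\alpha'$, and $\gamma$ as explicit arguments rather than deriving them from (1) to (3), and additionally reports failure if no consistent database remains (a case the analysis shows to be unlikely). All function names are hypothetical.

```python
import itertools
import math
import random
import statistics

def lap(scale, rng):
    """Sample Lap(scale) as a scaled difference of two unit exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def frac(pred, db):
    """Predicate query: fraction of db elements satisfying pred."""
    return sum(1 for x in db if pred(x)) / len(db)

def median_mechanism(db, queries, domain, m, eps, alpha_prime, gamma, rng):
    n = len(db)
    # Step 1: every size-m multiset over the domain (exponential; toy sizes only).
    consistent = list(itertools.combinations_with_replacement(domain, m))
    hard_budget = 20 * m * math.log(len(domain))
    max_j = int(3 / (20 * gamma) + 1e-9)
    weights = [2.0 ** -j for j in range(max_j + 1)]
    answers, hard_count = [], 0
    for f in queries:
        truth = frac(f, db)
        values = [frac(f, s) for s in consistent]
        # Step 2(a): easiness score r_i and its perturbed version.
        r = sum(math.exp(-abs(truth - v) / eps) for v in values) / len(values)
        r_hat = r + lap(2 / (eps * n * alpha_prime), rng)
        # Step 2(b): random threshold between 3/4 and 9/10.
        j = rng.choices(range(max_j + 1), weights=weights)[0]
        t = 0.75 + j * gamma
        if r_hat >= t:
            # Step 2(c): easy query, answered by the median over C_{i-1}.
            a = statistics.median(values)
        else:
            # Steps 2(d)-(f): hard query, Laplace answer, then shrink C_i.
            a = truth + lap(1 / (n * alpha_prime), rng)
            consistent = [s for s, v in zip(consistent, values)
                          if abs(v - a) <= eps / 50]
            hard_count += 1
            if hard_count > hard_budget or not consistent:
                raise RuntimeError("median mechanism halted and reported failure")
        answers.append(a)
    return answers
```

On a tiny domain, the first few queries are typically classified as hard and shrink the consistent set; once that set pins down the answers, later queries are answered exactly by the median without touching the database beyond the easiness test.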



The motivation behind the mechanism's steps is as follows.
The set $C_i$ is the set of size-$m$ databases consistent
(up to $\pm\epsilon/50$) with previous answers of the mechanism to
hard queries. The focus on databases with the small size
$m$ is justified by a VC dimension argument, see Proposition 4.6.
Steps 2(a) and 2(b) choose a random value $\hat{r}_i$ and
a random threshold $t_i$. The value $r_i$ in Step 2(a) is a measure
of how easy the query is, with higher numbers being easier.
A more obvious measure would be the fraction of databases
$S$ in $C_{i-1}$ for which $|f_i(S) - f_i(D)| \le \epsilon$, but this is
a highly sensitive statistic (unlike $r_i$, see Lemma 4.9). The
mechanism uses the perturbed value $\hat{r}_i$ rather than $r_i$ to
privately communicate which queries are easy and which are hard.
In Step 2(b), we choose the threshold $t_i$ at random between
3/4 and 9/10. This randomly shifted threshold ensures that,
for every database $D$, there is likely to be a significant gap
between $r_i$ and $t_i$; such gaps are useful when optimizing the
privacy guarantee. Steps 2(c) and 2(d) answer easy and hard
queries, respectively. Step 2(e) updates the set of databases
consistent with previous answers to hard queries. We prove
in Lemma 4.7 that Step 2(f) occurs with at most inverse
polynomial probability.

Finally, we note that the median mechanism is defined as 
if the total number of queries k is (approximately) known 
in advance. This assumption can be removed by using
successively doubling "guesses" of $k$; this increases the privacy
cost by an $O(\log k)$ factor.

4. ANALYSIS OF MEDIAN MECHANISM 

This section proves the following privacy and utility guar- 
antees for the basic implementation of the median mecha- 
nism. 

Theorem 4.1. For every sequence of adaptively chosen
predicate queries $f_1, \ldots, f_k$ arriving online, the median
mechanism is $(\epsilon, \delta)$-useful and $(\alpha, \tau)$-differentially
private, where $\tau$ is a negligible function of $k$ and $|X|$, and
$\delta$ is an inverse polynomial function of $k$ and $n$, provided the
database size $n$ satisfies

$$n \ge \frac{30 \ln\frac{2k}{\alpha} \log_2 k}{\alpha'\epsilon} = \Theta\!\left(\frac{\log|X| \log^3 k \log\frac{1}{\epsilon}}{\alpha\epsilon^3}\right). \tag{6}$$

We prove the utility and privacy guarantees in Sections 4.1
and 4.2, respectively.⁸

4.1 Utility of the Median Mechanism 

Here we prove a utility guarantee for the median mecha- 
nism. 

Theorem 4.2. The median mechanism is $(\epsilon, \delta)$-useful,
where $\delta = k \exp(-\Omega(\epsilon n \alpha'))$.

Note that under assumption (6), $\delta$ is inverse polynomial in
$k$ and $n$.

We give the proof of Theorem 4.2 in three pieces: with
high probability, every hard query is answered accurately
(Lemma 4.4); every easy query is answered accurately
(Lemmas 4.3 and 4.5); and the algorithm does not fail
(Lemma 4.7). The next two lemmas follow from the definition
of the Laplace distribution (5), our choice of $\delta$, and
trivial union bounds.

Lemma 4.3. With probability at least $1 - \frac{\delta}{2}$,
$|r_i - \hat{r}_i| \le 1/100$ for every query $i$.

Lemma 4.4. With probability at least $1 - \frac{\delta}{2}$, every answer
to a hard query is $(\epsilon/100)$-accurate for $D$.

The next lemma shows that median answers are accurate 
for easy queries. 

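Lemma 4.5. If $|r_i - \hat{r}_i| \le 1/100$ for every query $i$, then
every answer to an easy query is $\epsilon$-accurate for $D$.

PROOF. For a query $i$, let
$G_{i-1} = \{S \in C_{i-1} : |f_i(D) - f_i(S)| \le \epsilon\}$ denote the
databases of $C_{i-1}$ on which the result of query $f_i$ is
$\epsilon$-accurate for $D$. Observe that if
$|G_{i-1}| \ge .51 \cdot |C_{i-1}|$, then the median value of $f_i$ on
$C_{i-1}$ is an $\epsilon$-accurate answer for $D$. Thus proving the
lemma reduces to showing that $\hat{r}_i \ge 3/4$ only if
$|G_{i-1}| \ge .51 \cdot |C_{i-1}|$.

Consider a query $i$ with $|G_{i-1}| < .51 \cdot |C_{i-1}|$. Using (4),
and noting that every database outside $G_{i-1}$ contributes at
most $e^{-1}$ to the sum, we have

$$r_i = \frac{\sum_{S \in C_{i-1}} \exp(-\epsilon^{-1} |f_i(D) - f_i(S)|)}{|C_{i-1}|} \le \frac{|G_{i-1}| + e^{-1} |C_{i-1} \setminus G_{i-1}|}{|C_{i-1}|} \le \frac{\left(\frac{51}{100} + \frac{49}{100} e^{-1}\right) |C_{i-1}|}{|C_{i-1}|} < \frac{74}{100}.$$

Since $|r_i - \hat{r}_i| \le 1/100$ for every query $i$ by assumption, the
proof is complete. □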

Our final lemma shows that the median mechanism does
not fail and hence answers every query, with high probability;
this will conclude our proof of Theorem 4.2. We need
the following preliminary proposition, which instantiates the
standard uniform convergence bound with the fact that the
VC dimension of every set of $k$ predicate queries is at most
$\log_2 k$ [Vap96]. Recall the definition of the parameter $m$
from (1).

8 If desired, in Theorem 4.1 we can treat $n$ as a parameter
and solve for the error $\epsilon$. The maximum error on any query
(normalized by the database size) is
$O(\log k \log^{1/3}|X| / (n^{1/3}\alpha^{1/3}))$; the unnormalized
error is a factor of $n$ larger.



Proposition 4.6 (Uniform Convergence Bound).
For every collection of $k$ predicate queries $f_1, \ldots, f_k$ and
every database $D$, a database $S$ obtained by sampling points
from $D$ uniformly at random will satisfy
$|f_i(D) - f_i(S)| \le \epsilon$ for all $i$ except with probability
$\delta$, provided

$$|S| \ge \frac{1}{2\epsilon^2} \left( \log k + \log\frac{2}{\delta} \right).$$

In particular, there exists a database $S$ of size $m$ such that
for all $i \in \{1, \ldots, k\}$, $|f_i(D) - f_i(S)| \le \epsilon/400$.

In other words, the results of $k$ predicate queries on an
arbitrarily large database can be well approximated by those
on a database with size only $O(\log k)$.
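
The proposition is easy to check empirically. The sketch below assumes the sample-size bound $|S| \ge \frac{1}{2\epsilon^2}(\ln k + \ln\frac{2}{\delta})$, which is what Hoeffding's inequality plus a union bound over the $k$ queries gives; sampling is done with replacement, and all function names are hypothetical.

```python
import math
import random

def frac(pred, db):
    """Predicate query: fraction of db elements satisfying pred."""
    return sum(1 for x in db if pred(x)) / len(db)

def required_sample_size(k, eps, delta):
    """Sample size from Hoeffding's inequality and a union bound over k queries."""
    return math.ceil((math.log(k) + math.log(2 / delta)) / (2 * eps ** 2))

def subsample(db, size, rng):
    """Sample `size` points from db uniformly at random, with replacement."""
    return [rng.choice(db) for _ in range(size)]

def worst_error(queries, db, sample):
    """Largest gap between a query's value on db and on the sample."""
    return max(abs(frac(q, db) - frac(q, sample)) for q in queries)
```

Note that the required size depends on $k$, $\epsilon$, and $\delta$ only, not on the size of the original database.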

Lemma 4.7. If $|r_i - \hat{r}_i| \le 1/100$ for every query $i$ and
every answer to a hard query is $(\epsilon/400)$-accurate for $D$, then
the median mechanism answers fewer than $20 m \log|X|$ hard
queries (and hence answers all queries before terminating).

Proof. The plan is to track the contraction of $C_i$ as hard
queries are answered by the median mechanism. Initially
we have $|C_0| \le |X|^m$. If the median mechanism answers a
hard query $i$, then the definition of the mechanism and our
hypotheses yield

$$r_i \le \hat{r}_i + \frac{1}{100} < t_i + \frac{1}{100} \le \frac{91}{100}.$$

We then claim that the size of the set
$C_i = \{S \in C_{i-1} : |f_i(S) - a_i| \le \epsilon/50\}$ is at most
$\frac{94}{100} |C_{i-1}|$. For if not,

$$r_i = \frac{\sum_{S \in C_{i-1}} \exp(-\epsilon^{-1} |f_i(S) - f_i(D)|)}{|C_{i-1}|} \ge \frac{94}{100} \cdot \exp\left(-\frac{1}{50}\right) > \frac{92}{100},$$

which is a contradiction.

Iterating now shows that the number of consistent
databases decreases exponentially with the number of hard
queries:

$$|C_k| \le \left(\frac{94}{100}\right)^h |X|^m \tag{7}$$

if $h$ of the $k$ queries are hard.

On the other hand, Proposition 4.6 guarantees the existence
of a database $S^* \in C_0$ for which
$|f_i(S^*) - f_i(D)| \le \epsilon/100$ for every query $f_i$. Since all
answers $a_i$ produced by the median mechanism for hard queries $i$
are $(\epsilon/100)$-accurate for $D$ by assumption,
$|f_i(S^*) - a_i| \le |f_i(S^*) - f_i(D)| + |f_i(D) - a_i| \le \epsilon/50$.
This shows that $S^* \in C_k$ and hence $|C_k| \ge 1$. Combining this
with (7) gives

$$h \le \frac{m \ln|X|}{\ln(50/47)} < 20 m \ln|X|,$$

as desired. □
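
The numerical constants used in the proofs of Lemmas 4.5 and 4.7 can be verified with a few lines of arithmetic. The sketch below is only a sanity check of those constants; the function names are hypothetical.

```python
import math

def easy_score_upper_bound(good_fraction):
    """Bound on r_i when at most `good_fraction` of C_{i-1} is eps-accurate:
    accurate databases contribute at most 1 each, the rest at most 1/e."""
    return good_fraction + (1 - good_fraction) * math.exp(-1)

def shrink_contradiction_value():
    """The lower bound (94/100) * exp(-1/50) on r_i used in Lemma 4.7."""
    return 0.94 * math.exp(-1 / 50)

def hard_query_bound(m, domain_size):
    """Largest real h with (94/100)^h * |X|^m >= 1."""
    return m * math.log(domain_size) / math.log(50 / 47)
```

Evaluating these gives roughly 0.690 for the first bound (below 74/100), roughly 0.921 for the second (above 92/100), and a hard-query bound of about $16.2\, m \ln|X|$, comfortably below $20 m \ln|X|$.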



4.2 Privacy of the Median Mechanism 

This section establishes the following privacy guarantee 
for the median mechanism. 

Theorem 4.8. The median mechanism is $(\alpha, \tau)$-differentially
private, where $\tau$ is a negligible function of $|X|$ and
$k$ when $n$ is sufficiently large (as in (6)).



We can treat the median mechanism as if it has two outputs:
a vector of answers $a \in \mathbb{R}^k$, and a vector
$d \in \{0, 1\}^k$ such that $d_i = 0$ if $i$ is an easy query and
$d_i = 1$ if $i$ is a hard query. A key observation in the privacy
analysis is that answers to easy queries are a function only of the
previous output of the mechanism, and incur no additional
privacy cost beyond the release of the bit $d_i$. Moreover, the
median mechanism is guaranteed to produce no more than
$O(m \log|X|)$ answers to hard queries. Intuitively, what we
need to show is that the vector $d$ can be released after an
unusually small perturbation.

Our first lemma states that the small sensitivity of predicate
queries carries over, with a $2/\epsilon$ factor loss, to the
$r$-function defined in (4).

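Lemma 4.9. The function
$r_i(D) = \left(\sum_{S \in C} \exp(-\epsilon^{-1} |f(D) - f(S)|)\right) / |C|$
has sensitivity $\frac{2}{\epsilon n}$ for every fixed set $C$ of
databases and predicate query $f$.

Proof. Let $D$ and $D'$ be neighboring databases. Then

$$r_i(D) = \frac{\sum_{S \in C} \exp(-\epsilon^{-1} |f(D) - f(S)|)}{|C|}$$
$$\le \frac{\sum_{S \in C} \exp(-\epsilon^{-1} (|f(D') - f(S)| - n^{-1}))}{|C|}$$
$$= \exp\left(\frac{1}{\epsilon n}\right) \cdot r_i(D')$$
$$\le \left(1 + \frac{2}{\epsilon n}\right) \cdot r_i(D')$$
$$\le r_i(D') + \frac{2}{\epsilon n},$$

where the first inequality follows from the fact that the
(predicate) query $f$ has sensitivity $1/n$, the second from the
fact that $e^x \le 1 + 2x$ when $x \in [0, 1]$, and the third from the
fact that $r_i(D') \le 1$. □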
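
The sensitivity bound can be checked by brute force on a small instance. The sketch below computes the $r$-function for a fixed candidate set and measures its largest change over all neighbors of one database; it assumes $\epsilon n \ge 1$, as needed for the step $e^x \le 1 + 2x$, and the function names are hypothetical.

```python
import itertools
import math

def frac(pred, db):
    """Predicate query: fraction of db elements satisfying pred."""
    return sum(1 for x in db if pred(x)) / len(db)

def r_value(pred, db, candidates, eps):
    """Average of exp(-|f(D) - f(S)| / eps) over the candidate databases S."""
    truth = frac(pred, db)
    total = sum(math.exp(-abs(truth - frac(pred, s)) / eps) for s in candidates)
    return total / len(candidates)

def max_change_over_neighbors(pred, db, domain, candidates, eps):
    """Largest |r(D) - r(D')| over databases D' differing from D in one element."""
    base = r_value(pred, db, candidates, eps)
    worst = 0.0
    for i, old in enumerate(db):
        for new in domain:
            if new != old:
                neighbor = db[:i] + [new] + db[i + 1:]
                change = abs(base - r_value(pred, neighbor, candidates, eps))
                worst = max(worst, change)
    return worst
```

On small examples the observed change is well below $2/(\epsilon n)$, reflecting the slack in the inequality $e^x \le 1 + 2x$.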

The next lemma identifies nice properties of "typical
executions" of the median mechanism. Consider an output
$(d, a)$ of the median mechanism with a database $D$. From $D$
and $(d, a)$, we can uniquely recover the values $r_1, \ldots, r_k$
computed (via (4)) in Step 2(a) of the median mechanism, with
$r_i$ depending only on the first $i - 1$ components of $d$ and $a$.
We sometimes write such a value as $r_i(D, (d, a))$, or as $r_i(D)$
if an output $(d, a)$ has been fixed. Call a possible threshold
$t_i$ good for $D$ and $(d, a)$ if $d_i = 0$ and
$r_i(D, (d, a)) \ge t_i + \gamma$, where $\gamma$ is defined as in (3).
Call a vector $t$ of possible thresholds good for $D$ and $(d, a)$ if
all but $180 m \ln|X|$ of the thresholds are good for $D$ and
$(d, a)$.

Lemma 4.10. For every database $D$, with all but negligible
($\exp(-\Omega(\log k \log|X|/\epsilon^2))$) probability, the thresholds
$t$ generated by the median mechanism are good for its output
$(d, a)$.

Proof. The idea is to "charge" the probability of bad
thresholds to that of answering hard queries, which are
strictly limited by the median mechanism. Since the median
mechanism only allows $20 m \ln|X|$ of the $d_i$'s to be 1,
we only need to bound the number of queries $i$ with output
$d_i = 0$ and threshold $t_i$ satisfying $r_i < t_i + \gamma$, where
$r_i$ is the value computed by the median mechanism in Step 2(a)
when it answers the query $i$.

Let $Y_i$ be the indicator random variable corresponding to
the (larger) event that $r_i < t_i + \gamma$. Define $Z_i$ to be 1 if
and only if, when answering the $i$th query, the median mechanism
chooses a threshold $t_i$ and a Laplace perturbation $\Delta_i$
such that $r_i + \Delta_i < t_i$ (i.e., the query is classified as
hard). If the median mechanism fails before reaching query $i$,
then we define $Y_i = Z_i = 0$. Set $Y = \sum_{i=1}^{k} Y_i$ and
$Z = \sum_{i=1}^{k} Z_i$. We can finish the proof by showing that
$Y$ is at most $160 m \ln|X|$ except with negligible probability.

Consider a query $i$ and condition on the event that
$r_i \ge \frac{9}{10}$; this event depends only on the results of
previous queries. In this case, $Y_i = 1$ only if $t_i = 9/10$. But
this occurs with probability $2^{-3/(20\gamma)}$, which using (3) and
(6) is at most $1/k$.⁹ Therefore, the expected contribution to $Y$
coming from queries $i$ with $r_i \ge \frac{9}{10}$ is at most 1.
Since $t_i$ is selected independently at random for each $i$, the
Chernoff bound implies that the probability that such queries
contribute more than $m \ln|X|$ to $Y$ is

$$\exp(-\Omega((m \log|X|)^2)) = \exp(-\Omega((\log k)^2 (\log|X|)^2 / \epsilon^4)).$$

Now condition on the event that $r_i < \frac{9}{10}$. Let $T_i$ denote
the threshold choices that would cause $Y_i$ to be 1, and let $s_i$
be the smallest such; since $r_i < \frac{9}{10}$, $|T_i| \ge 2$. For
every $t_i \in T_i$, $t_i > r_i - \gamma$; hence, for every
$t_i \in T_i \setminus \{s_i\}$, $t_i > r_i$. Also, our distribution on
the $j$'s in Step 2(b) ensures that
$\Pr[t_i \in T_i \setminus \{s_i\}] \ge \frac{1}{3} \Pr[t_i \in T_i]$.
Since the Laplace distribution is symmetric around zero and the
random choices $\Delta_i, t_i$ are independent, we have

$$E[Z_i] = \Pr[t_i > r_i + \Delta_i] \ge \Pr[t_i > r_i] \cdot \Pr[\Delta_i \le 0] \ge \frac{1}{6} \Pr[t_i > r_i - \gamma] = \frac{1}{6} E[Y_i]. \tag{8}$$

The definition of the median mechanism ensures that
$Z \le 20 m \ln|X|$ with probability 1. Linearity of expectation,
inequality (8), and the Chernoff bound imply that queries
with $r_i < \frac{9}{10}$ contribute at most $159 m \ln|X|$ to $Y$ with
probability at least $1 - \exp(-\Omega(\log k \log|X|/\epsilon^2))$. The
proof is complete. □
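
The property of the threshold distribution used in this proof, namely that dropping the smallest element of any suffix of at least two thresholds keeps at least a third of the suffix's probability mass, can be checked numerically. The sketch below assumes the reading of Step 2(b) in which $j \in \{0, \ldots, 3/(20\gamma)\}$ is drawn with probability proportional to $2^{-j}$; the function names are hypothetical.

```python
def threshold_distribution(gamma):
    """List of (threshold, probability) pairs for t = 3/4 + j * gamma."""
    max_j = int(3 / (20 * gamma) + 1e-9)
    weights = [2.0 ** -j for j in range(max_j + 1)]
    total = sum(weights)
    return [(0.75 + j * gamma, w / total) for j, w in enumerate(weights)]

def suffix_mass_ratio(dist, start):
    """Pr[t in T minus {s}] / Pr[t in T], where T is the suffix of thresholds
    beginning at index `start` and s is its smallest element."""
    suffix = [p for _, p in dist[start:]]
    return sum(suffix[1:]) / sum(suffix)
```

The ratio is smallest, exactly 1/3, for the suffix consisting of the two largest thresholds, and approaches 1/2 for longer suffixes.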



We can now prove Theorem 4.8.

Proof of Theorem 4.8. Recall Definition 2 and fix a
database $D$, queries $f_1, \ldots, f_k$, and a subset $S$ of possible
mechanism outputs. For simplicity, we assume that
all perturbations are drawn from a discretized Laplace
distribution, so that the median mechanism has a countable
range; the continuous case can be treated using similar
arguments. Then, we can think of $S$ as a countable set of
output vector pairs $(d, a)$ with $d \in \{0, 1\}^k$ and
$a \in \mathbb{R}^k$. We write $MM(D, f) = (d, a)$ for the event that
the median mechanism classifies the queries
$f = (f_1, \ldots, f_k)$ according to $d$ and outputs the numerical
answers $a$. If the mechanism computes thresholds $t$ while doing
so, we write $MM(D, f) = (t, d, a)$. Let $G((d, a), D)$ denote the
vectors that would be good thresholds for $(d, a)$ and $D$. (Recall
that $D$ and $(d, a)$ uniquely define the corresponding
$r_i(D, (d, a))$'s.) We have

$$\Pr[MM(D, f) \in S] = \sum_{(d,a) \in S} \Pr[MM(D, f) = (d, a)]$$
$$\le \tau + \sum_{(d,a) \in S} \Pr[MM(D, f) = (t, d, a) \text{ with some } t \text{ good for } (d, a), D]$$
$$= \tau + \sum_{(d,a) \in S} \sum_{t \in G((d,a), D)} \Pr[MM(D, f) = (t, d, a)],$$

where $\tau$ is the negligible function of Lemma 4.10. We complete
the proof by showing that, for every neighboring database $D'$,
possible output $(d, a)$, and thresholds $t$ good for $(d, a)$ and $D$,

$$\Pr[MM(D, f) = (t, d, a)] \le e^{\alpha} \cdot \Pr[MM(D', f) = (t, d, a)]. \tag{9}$$

Fix a neighboring database $D'$, a target output $(d, a)$, and
thresholds $t$ good for $(d, a)$ and $D$. The probability that
the median mechanism chooses the target thresholds $t$ is
independent of the underlying database, and so is the same
on both sides of (9). For the rest of the proof, we condition
on the event that the median mechanism uses the thresholds
$t$ (both with database $D$ and database $D'$).

Let $\mathcal{E}_i$ denote the event that $MM(D, f)$ classifies the first
$i$ queries in agreement with the target output (i.e., query
$j \le i$ is deemed easy if and only if $d_j = 0$) and that its first
$i$ answers are $a_1, \ldots, a_i$. Let $\mathcal{E}'_i$ denote the analogous
event for $MM(D', f)$. Observe that $\mathcal{E}_k, \mathcal{E}'_k$ are the
relevant events on the left- and right-hand sides of (9),
respectively (after conditioning on $t$). If $(d, a)$ is such that the
median mechanism would fail after the $\ell$th query, then the
following proof should be applied to $\mathcal{E}_\ell, \mathcal{E}'_\ell$
instead of $\mathcal{E}_k, \mathcal{E}'_k$. We next give a crude upper bound
on the ratio
$\Pr[\mathcal{E}_i | \mathcal{E}_{i-1}] / \Pr[\mathcal{E}'_i | \mathcal{E}'_{i-1}]$
that holds for every query (see (10), below), followed by a much
better upper bound for queries with good thresholds.

Imagine running the median mechanism in parallel on
$D, D'$ and condition on the events $\mathcal{E}_{i-1}, \mathcal{E}'_{i-1}$. The set
$C_{i-1}$ is then the same in both runs of the mechanism, and
$r_i(D), r_i(D')$ are now fixed. Let $b_i$ ($b'_i$) be 0 if $MM(D,f)$
($MM(D',f)$) classifies query $i$ as easy and 1 otherwise.
Since $r_i(D') \in [r_i(D) \pm \frac{2}{\epsilon n}]$ (Lemma 4.9) and a perturbation
with distribution $\mathrm{Lap}(\frac{2}{\alpha' \epsilon n})$ is added to these values
before comparing to the threshold $t_i$ (Step 2(a)),

$$\Pr[b_i = 0 \mid \mathcal{E}_{i-1}] \le e^{\alpha'} \cdot \Pr[b'_i = 0 \mid \mathcal{E}'_{i-1}]$$

and similarly for the events where $b_i, b'_i = 1$. Suppose that
the target classification is $d_i = 1$ (a hard query), and let
$s_i$ and $s'_i$ denote the random variables $f_i(D) + \mathrm{Lap}(\frac{1}{\alpha' n})$
and $f_i(D') + \mathrm{Lap}(\frac{1}{\alpha' n})$, respectively. Independence of the
Laplace perturbations in Steps 2(a) and 2(d) implies that



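$$\Pr[\mathcal{E}_i \mid \mathcal{E}_{i-1}] = \Pr[b_i = 1 \mid \mathcal{E}_{i-1}] \cdot \Pr[s_i = a_i \mid \mathcal{E}_{i-1}]$$

and

$$\Pr[\mathcal{E}'_i \mid \mathcal{E}'_{i-1}] = \Pr[b'_i = 1 \mid \mathcal{E}'_{i-1}] \cdot \Pr[s'_i = a_i \mid \mathcal{E}'_{i-1}].$$

Since the predicate query $f_i$ has sensitivity $1/n$, we have

$$\Pr[\mathcal{E}_i \mid \mathcal{E}_{i-1}] \le e^{2\alpha'} \cdot \Pr[\mathcal{E}'_i \mid \mathcal{E}'_{i-1}] \qquad (10)$$

when $d_i = 1$.

9 For simplicity, we ignore the normalizing constant in the
distribution over $j$'s in Step 2(b), which is $\Theta(1)$.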

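Now suppose that $d_i = 0$, and let $m_i$ denote the median
value of $f_i$ on $C_{i-1}$. Then $\Pr[\mathcal{E}_i \mid \mathcal{E}_{i-1}]$ is either 0 (if $m_i \ne a_i$)
or $\Pr[b_i = 0 \mid \mathcal{E}_{i-1}]$ (if $m_i = a_i$); similarly, $\Pr[\mathcal{E}'_i \mid \mathcal{E}'_{i-1}]$
is either 0 or $\Pr[b'_i = 0 \mid \mathcal{E}'_{i-1}]$. Thus the bound in (10)
continues to hold (even with $e^{2\alpha'}$ replaced by $e^{\alpha'}$) when
$d_i = 0$.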
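
The sensitivity step behind (10) can be sanity-checked numerically. The sketch below is illustrative only: it assumes the noise added to a hard answer is Laplace with scale equal to the sensitivity $1/n$ divided by $\alpha'$, and the values of `n`, `alpha_prime`, and the two neighboring answers are hypothetical. It checks that the density ratio of the two shifted Laplace distributions never exceeds $e^{\alpha'}$, which is one of the two factors multiplied together in (10).

```python
import math

def laplace_pdf(x, loc, scale):
    """Density of loc + Lap(scale) evaluated at x."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)

def max_density_ratio(value_d, value_d_prime, scale, grid):
    """Largest ratio of the two shifted Laplace densities over a grid of outputs."""
    return max(
        laplace_pdf(x, value_d, scale) / laplace_pdf(x, value_d_prime, scale)
        for x in grid
    )

# Hypothetical parameters, chosen only for illustration.
n = 1000                            # database size
alpha_prime = 0.05                  # per-query privacy parameter alpha'
scale = 1.0 / (alpha_prime * n)     # assumed noise scale: sensitivity (1/n) / alpha'
f_d, f_d_prime = 0.300, 0.301       # answers on neighboring databases differ by 1/n

grid = [i / 10000.0 for i in range(10001)]   # candidate outputs in [0, 1]
ratio = max_density_ratio(f_d, f_d_prime, scale, grid)
bound = math.exp(alpha_prime)
```

The same calculation applies to the perturbed value compared against the threshold, so a hard query contributes at most two such factors.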



Since $\alpha'$ is not much smaller than the privacy target $\alpha$
(recall (2)), we cannot afford to suffer the upper bound
in (10) for many queries. Fortunately, for queries $i$ with
good thresholds we can do much better. Consider a query $i$
such that $t_i$ is good for $(d,a)$ and $D$ and condition again
on $\mathcal{E}_{i-1}, \mathcal{E}'_{i-1}$, which fixes $C_{i-1}$ and hence $r_i(D)$. Goodness
implies that $d_i = 0$, so the arguments from the previous
paragraph also apply here. We can therefore assume
that the median value $m_i$ of $f_i$ on $C_{i-1}$ equals $a_i$ and focus
on bounding $\Pr[b_i = 0 \mid \mathcal{E}_{i-1}]$ in terms of $\Pr[b'_i = 0 \mid \mathcal{E}'_{i-1}]$.

Goodness also implies that $r_i(D) \ge t_i + \gamma$ and hence
$r_i(D') \ge t_i + \gamma - \frac{2}{\epsilon n} \ge t_i + \frac{\gamma}{2}$ (by Lemma 4.9). Recalling
from (3) the definition of $\gamma$, we have



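$$\Pr[b'_i = 0 \mid \mathcal{E}'_{i-1}] \ge 1 - \frac{1}{2} e^{-\gamma \alpha' \epsilon n / 4} \ge 1 - \frac{\alpha}{4k} \qquad (11)$$

and of course, $\Pr[b_i = 0 \mid \mathcal{E}_{i-1}] \le 1$.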

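Applying (10) to the bad queries (at most $180 m \ln|X|$
of them, since $t$ is good for $(d,a)$ and $D$) and (11) to the
rest, we can derive

$$\Pr[\mathcal{E}_k] = \prod_{i=1}^{k} \Pr[\mathcal{E}_i \mid \mathcal{E}_{i-1}] \le \underbrace{e^{360 \alpha' m \ln|X|}}_{\le\, e^{\alpha/2} \text{ by (2)}} \cdot \underbrace{\left(1 - \frac{\alpha}{4k}\right)^{-k}}_{\le\, (1 + \frac{\alpha}{2k})^{k} \,\le\, e^{\alpha/2}} \cdot \prod_{i=1}^{k} \Pr[\mathcal{E}'_i \mid \mathcal{E}'_{i-1}]$$

$$\le e^{\alpha} \cdot \Pr[\mathcal{E}'_k],$$

which completes the proof of both the inequality (9) and the
theorem. ■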
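
The elementary inequalities used for the good queries in the last display, $(1 - \frac{\alpha}{4k})^{-k} \le (1 + \frac{\alpha}{2k})^{k} \le e^{\alpha/2}$, can be spot-checked numerically. This is a small illustrative sketch, not part of the proof; the function name and the sampled values of $\alpha$ and $k$ are hypothetical.

```python
import math

def chain_factors(alpha, k):
    """Return the three quantities compared in the final step of the privacy proof."""
    lower = (1.0 - alpha / (4.0 * k)) ** (-k)    # product of the per-query ratios from (11)
    middle = (1.0 + alpha / (2.0 * k)) ** k      # uses 1/(1-x) <= 1+2x for 0 <= x <= 1/2
    upper = math.exp(alpha / 2.0)                # uses (1+y)^k <= exp(k*y)
    return lower, middle, upper

# Hypothetical sample values of the privacy target alpha and query count k.
checks = [
    chain_factors(alpha, k)
    for alpha in (0.01, 0.5, 1.0)
    for k in (1, 10, 10**4, 10**6)
]
all_hold = all(lo <= mid * (1 + 1e-12) <= up * (1 + 1e-12) for lo, mid, up in checks)
```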

5. THE MEDIAN MECHANISM: 
EFFICIENT IMPLEMENTATION 

The basic implementation of the median mechanism runs
in time $|X|^{\Theta(\log k \log(1/\epsilon)/\epsilon^2)}$. This section provides an efficient
implementation, running in time polynomial in $n$, $k$,
and $|X|$, although with a weaker usefulness guarantee.

Theorem 5.1. Assume that the database size $n$ satisfies
(1). For every sequence of adaptively chosen predicate
queries $f_1, \ldots, f_k$ arriving online, the efficient implementation
of the median mechanism is $(\alpha, \tau)$-differentially private
for a negligible function $\tau$. Moreover, for every fixed set
$f_1, \ldots, f_k$ of queries, it is $(\epsilon, \delta)$-useful for all but a negligible
fraction of fractional databases (equivalently, probability
distributions).

Specifically, our mechanism answers exponentially many
queries for all but an $O(|X|^{-m})$ fraction of probability distributions
over $X$ drawn from the unit $\ell_1$ ball, and for
databases drawn from such distributions. Thus our efficient
implementation always guarantees privacy, but for a given
set of queries $f_1, \ldots, f_k$, there might be a negligibly small
fraction of fractional histograms for which our mechanism is
not useful for all $k$ queries.

We note however that even for the small fraction of frac- 
tional histograms for which the efficient median mechanism 
may not satisfy our usefulness guarantee, it does not output 
incorrect answers: it merely halts after having answered a
sufficiently large number of queries using the Laplace mechanism.
Therefore, even for this small fraction of databases,
the efficient median mechanism is an improvement over the 
Laplace mechanism: in the worst case, it simply answers ev- 
ery query using the Laplace mechanism before halting, and 
in the best case, it is able to answer many more queries. 

We give a high-level overview of the proof of Theorem 5.1,
which we then make formal. First, why isn't the median
mechanism a computationally efficient mechanism? Because
$C_0$ has super-polynomial size $|X|^m$, and computing $r_i$
in Step 2(a), the median value in Step 2(c), and the set $C_i$
in Step 2(e) could require time proportional to $|C_0|$. An
obvious idea is to randomly sample elements of $C_{i-1}$ to
approximately compute $r_i$ and the median value of $f_i$ on $C_{i-1}$;
while it is easy to control the resulting sampling error and
preserve the utility and privacy guarantees of Section 4, it
is not clear how to sample from $C_{i-1}$ efficiently.

We show how to implement the median mechanism in
polynomial time by redefining the sets $C_i$ to be sets of probability
distributions over points in $X$ that are consistent (up
to $\pm\frac{\epsilon}{50}$) with the hard queries answered up to the $i$th query.
Each set $C_i$ will be a convex polytope in $\mathbb{R}^{|X|}$ defined by the
intersection of at most $O(m \log |X|)$ halfspaces, and hence
it will be possible to sample points from $C_i$ approximately
uniformly at random in time poly$(|X|, m)$ via the grid walk
of Dyer, Frieze, and Kannan [DFK91]. Lemmas 4.3, 4.4,
and 4.5 still hold (trivially modified to accommodate sampling
error). We have to reprove Lemma 4.7 in a somewhat
weaker form: that for all but a diminishing fraction of input
databases $D$, the median mechanism does not abort except
with probability $k \exp(-\Omega(\epsilon n \alpha'))$. As for our privacy
analysis of the median mechanism, it is independent of the
representation of the sets $C_i$ and the mechanism's failure
probability, and so it need not be repeated: the efficient
implementation is provably private for all input databases
and query sequences.

We now give a formal analysis of the efficient implemen- 
tation. 

5.1 Redefining the sets $C_i$

We redefine the sets $C_i$ to represent databases that can
contain points fractionally, as opposed to the finite set of
small discrete databases. Equivalently, we can view the sets
$C_i$ as containing probability distributions over the set of
points $X$.

We initialize $C_0$ to be the $\ell_1$ ball of radius $m$ in $\mathbb{R}^{|X|}$,
$mB_1^{|X|}$, intersected with the non-negative orthant:

$$C_0 = \{F \in \mathbb{R}^{|X|} : F \ge 0, \ \|F\|_1 \le m\}.$$

Each dimension $i$ in $\mathbb{R}^{|X|}$ corresponds to an element $x_i \in X$.
Elements $F \in C_0$ can be viewed as fractional histograms.
Note that integral points in $C_0$ correspond exactly
to databases of size at most $m$.

We generalize our query functions $f_i$ to fractional histograms
in the natural way:

$$f_i(F) = \frac{1}{m} \sum_{j : f_i(x_j) = 1} F_j.$$


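The update operation after a hard query $i$ is answered is
the same as in the basic implementation:

$$C_i = \{F \in C_{i-1} : |f_i(F) - a_i| \le \tfrac{\epsilon}{50}\}.$$

Note that each updating operation after a hard query merely
intersects $C_{i-1}$ with the pair of halfspaces

$$\sum_{j : f_i(x_j) = 1} F_j \le m a_i + \frac{\epsilon m}{50} \quad \text{and} \quad \sum_{j : f_i(x_j) = 1} F_j \ge m a_i - \frac{\epsilon m}{50},$$

and so $C_i$ is a convex polytope for each $i$.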

Dyer, Frieze, and Kannan [DFK91] show how to $\delta$-approximate
a random sample from a convex body $K \subseteq \mathbb{R}^{|X|}$
in time polynomial in $|X|$ and the running time of a membership
oracle for $K$, where $\delta$ can be taken to be exponentially
small (which is more than sufficient for our purposes). Their
algorithm has two requirements:

1. There must be an efficient membership oracle which
can in polynomial time determine whether a point $F \in
\mathbb{R}^{|X|}$ lies in $K$.

2. $K$ must be 'well rounded': $B_2^{|X|} \subseteq K \subseteq |X| B_2^{|X|}$,
where $B_2^{|X|}$ is the unit $\ell_2$ ball in $\mathbb{R}^{|X|}$.

Since $C_i$ is given as the intersection of a set of explicit halfspaces,
we have a simple membership oracle to determine
whether a given point $F \in C_i$: we simply check that $F$ lies
on the appropriate side of each of the halfspaces. This takes
time poly$(|X|, m)$, since the number of halfspaces defining $C_i$
is linear in the number of answers to hard queries given before
time $i$, which is never more than $20 m \ln |X|$. Moreover,
for each $i$ we have $C_i \subseteq C_0 \subseteq mB_1^{|X|} \subseteq mB_2^{|X|} \subseteq |X| B_2^{|X|}$.
Finally, we can safely assume that $B_2^{|X|} \subseteq C_i$ by simply considering
the convex set $C'_i = C_i + B_2^{|X|}$ instead. This will not
affect our results.
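
As a minimal sketch of the membership oracle just described, the following code stores a polytope as a list of halfspaces and checks a candidate point against each one. It assumes predicates are represented as boolean masks over the points of $X$ and that a fractional histogram is evaluated as $f_i(F) = \frac{1}{m}\sum_{j : f_i(x_j)=1} F_j$; the function names and the toy parameters are hypothetical, and the grid walk itself is not implemented here.

```python
def query_value(predicate_mask, F, m):
    """f_i(F): mass of F on points satisfying the predicate, divided by m."""
    return sum(Fj for Fj, hit in zip(F, predicate_mask) if hit) / m

def halfspaces_for_hard_query(predicate_mask, answer, eps, m):
    """Two halfspaces (w, b), each meaning w . F <= b, encoding |f_i(F) - a_i| <= eps/50."""
    w = [1.0 if hit else 0.0 for hit in predicate_mask]
    slack = eps * m / 50.0
    upper = (w, m * answer + slack)                      # sum F_j <= m*a_i + eps*m/50
    lower = ([-wj for wj in w], -(m * answer - slack))   # sum F_j >= m*a_i - eps*m/50
    return [upper, lower]

def in_polytope(F, m, halfspaces, tol=1e-9):
    """Membership oracle: F must lie in C_0 and on the correct side of every halfspace."""
    if any(Fj < -tol for Fj in F) or sum(F) > m + tol:
        return False
    return all(
        sum(wj * Fj for wj, Fj in zip(w, F)) <= b + tol
        for w, b in halfspaces
    )

# Toy example with |X| = 4, m = 10, and one hard query answered with a_i = 0.5.
m, eps = 10, 0.5
mask = [True, True, False, False]
C1 = halfspaces_for_hard_query(mask, answer=0.5, eps=eps, m=m)
inside = in_polytope([3.0, 2.0, 4.0, 1.0], m, C1)        # f_i(F) = 0.5
outside_slab = in_polytope([6.0, 2.0, 1.0, 1.0], m, C1)  # f_i(F) = 0.8
outside_ball = in_polytope([3.0, 2.0, 4.0, 2.0], m, C1)  # ||F||_1 = 11 > m
```

Each call costs time proportional to $|X|$ times the number of stored halfspaces, matching the poly$(|X|, m)$ bound stated above.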

Therefore, we can implement the median mechanism in
time poly$(|X|, k)$ by using sets $C_i$ as defined in this section,
and sampling from them using the grid walk of [DFK91]. Estimation
error in computing $r_i$ and the median value of $f_i$ on
$C_{i-1}$ by random sampling rather than brute force is easily
controlled via the Chernoff bound and can be incorporated
into the proofs of Lemmas 4.3 and 4.5 in the obvious way.
It remains to prove a continuous version of Lemma 4.7 to
show that the efficient implementation of the median mechanism
is $(\epsilon, \delta)$-useful on all but a negligibly small fraction of
fractional histograms $F$.
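
The sampling-based estimates mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes that $r_i$ is the average of $\exp(-\epsilon^{-1}|f_i(D) - f_i(F)|)$ over $F \in C_{i-1}$ as in the basic mechanism, that approximately uniform samples from $C_{i-1}$ are supplied by some external sampler (the grid walk is not implemented), and that predicates are boolean masks over $X$. All names and the toy inputs are hypothetical.

```python
import math
import statistics

def estimate_r_and_median(samples, predicate_mask, true_answer, eps, m):
    """Monte Carlo estimates of r_i and of the median of f_i over C_{i-1}.

    samples: fractional histograms assumed to be (approximately) uniform on C_{i-1}.
    true_answer: f_i(D) on the private database.
    """
    values = [
        sum(Fj for Fj, hit in zip(F, predicate_mask) if hit) / m
        for F in samples
    ]
    # Assumed form of r_i: the mean of exp(-|f_i(D) - f_i(F)| / eps).
    r_hat = sum(math.exp(-abs(true_answer - v) / eps) for v in values) / len(values)
    return r_hat, statistics.median(values)

# Toy inputs: three hand-picked "samples" with |X| = 2 and m = 10.
toy_samples = [[5.0, 0.0], [0.0, 5.0], [5.0, 5.0]]
r_hat, med = estimate_r_and_median(
    toy_samples, [True, False], true_answer=0.5, eps=0.1, m=10
)
```

In the mechanism itself, Laplace noise would still be added to the estimate of $r_i$ before it is compared with the threshold; that step is omitted here.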

5.2 Usefulness for Almost All Distributions 

We now prove an analogue of Lemma 4.7 to establish a
usefulness guarantee for the efficient version of the median
mechanism.

Definition 3. With respect to any set of $k$ queries
$f_1, \ldots, f_k$ and for any $F^* \in C_0$, define

$$\mathrm{Good}_\epsilon(F^*) = \{F \in C_0 : \max_{i \in \{1, 2, \ldots, k\}} |f_i(F) - f_i(F^*)| \le \epsilon\}$$

as the set of points that agree up to an additive $\epsilon$ factor with
$F^*$ on every query $f_i$.

Since databases $D \subset X$ can be identified with their corresponding
histogram vectors $F \in \mathbb{R}^{|X|}$, we can also write
$\mathrm{Good}_\epsilon(D)$ when the meaning is clear from context.

For any $F^*$, $\mathrm{Good}_\epsilon(F^*)$ is a convex polytope contained inside
$C_0$. We will prove that the efficient version of the median
mechanism is $(\epsilon, \delta)$-useful for a database $D$ if

$$\frac{\mathrm{Vol}(\mathrm{Good}_{\epsilon/100}(D))}{\mathrm{Vol}(C_0)} \ge \frac{1}{|X|^{2m}}. \qquad (12)$$

We first prove that (12) holds for almost every fractional
histogram. For this, we need a preliminary lemma.

Lemma 5.2. Let $L$ denote the set of integer points inside
$C_0$. Then with respect to an arbitrary set of $k$ queries,

$$C_0 \subseteq \bigcup_{F \in L} \mathrm{Good}_{\epsilon/400}(F).$$

Proof. Every rational valued point $F \in C_0$ corresponds
to some (large) database $D \subset X$ by scaling $F$ to an
integer-valued histogram. Irrational points can be approximated
arbitrarily well by such a finite database. By Proposition 4.6,
for every set of $k$ predicates $f_1, \ldots, f_k$, there is
a database $F^* \subset X$ with $|F^*| = m$ such that for each $i$,
$|f_i(F^*) - f_i(F)| \le \epsilon/400$. Recalling that the histograms
corresponding to databases of size at most $m$ are exactly
the integer points in $C_0$, the proof is complete. □



Lemma 5.3. All but an $|X|^{-m}$ fraction of fractional histograms
$F$ satisfy

$$\frac{\mathrm{Vol}(\mathrm{Good}_{\epsilon/200}(F))}{\mathrm{Vol}(C_0)} \ge \frac{1}{|X|^{2m}}.$$

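Proof. Let

$$B = \left\{F \in L : \frac{\mathrm{Vol}(\mathrm{Good}_{\epsilon/400}(F))}{\mathrm{Vol}(C_0)} \le \frac{1}{|X|^{2m}}\right\}.$$

Consider a randomly selected fractional histogram $F^* \in C_0$.
For any $F \in B$ we have:

$$\Pr[F^* \in \mathrm{Good}_{\epsilon/400}(F)] = \frac{\mathrm{Vol}(\mathrm{Good}_{\epsilon/400}(F))}{\mathrm{Vol}(C_0)} \le \frac{1}{|X|^{2m}}.$$

Since $|B| \le |L| \le |X|^m$, by a union bound we can conclude
that except with probability $\frac{1}{|X|^m}$, $F^* \notin \mathrm{Good}_{\epsilon/400}(F)$ for
any $F \in B$. However, by Lemma 5.2, $F^* \in \mathrm{Good}_{\epsilon/400}(F')$
for some $F' \in L$. Therefore, except with probability $1/|X|^m$,
$F' \in L \setminus B$. Thus, since $\mathrm{Good}_{\epsilon/400}(F') \subseteq \mathrm{Good}_{\epsilon/200}(F^*)$,
except with negligible probability, we have:

$$\frac{\mathrm{Vol}(\mathrm{Good}_{\epsilon/200}(F^*))}{\mathrm{Vol}(C_0)} \ge \frac{\mathrm{Vol}(\mathrm{Good}_{\epsilon/400}(F'))}{\mathrm{Vol}(C_0)} \ge \frac{1}{|X|^{2m}}.$$

□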



We are now ready to prove the analogue of Lemma 4.7 for
the efficient implementation of the median mechanism.

Lemma 5.4. For every set of $k$ queries $f_1, \ldots, f_k$, for all
but an $O(|X|^{-m})$ fraction of fractional histograms $F$, the efficient
implementation of the median mechanism guarantees
that: The mechanism answers fewer than $40 m \log |X|$ hard
queries, except with probability $k \exp(-\Omega(\epsilon n \alpha'))$.

Proof. We assume that all answers to hard queries are
$\epsilon/100$ accurate, and that $|r_i - \hat{r}_i| \le \frac{1}{100}$ for every $i$. By
Lemmas 4.3 and 4.4 (the former adapted to accommodate
approximating $r_i$ via random sampling) we are in this case
except with probability $k \exp(-\Omega(\epsilon n \alpha'))$.

We analyze how the volume of $C_i$ contracts with the number
of hard queries answered. Suppose the mechanism answers
a hard query at time $i$. Then:

$$r_i \le \hat{r}_i + \frac{1}{100} < t_i + \frac{1}{100} \le \frac{91}{100}.$$

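Recall $C_i = \{F \in C_{i-1} : |f_i(F) - a_i| \le \epsilon/50\}$. Suppose that
$\mathrm{Vol}(C_i) \ge \frac{94}{100} \mathrm{Vol}(C_{i-1})$. Then

$$r_i = \frac{\int_{C_{i-1}} \exp(-\epsilon^{-1} |f_i(F) - f_i(D)|) \, dF}{\mathrm{Vol}(C_{i-1})} \ge \frac{94}{100} \exp\left(-\frac{1}{50}\right) > \frac{92}{100},$$

a contradiction. Therefore, we have

$$\mathrm{Vol}(C_k) \le \left(\frac{94}{100}\right)^{h} \mathrm{Vol}(C_0), \qquad (13)$$

if $h$ of the $k$ queries are hard.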

Since all answers to hard queries are $\epsilon/100$ accurate, it
must be that $\mathrm{Good}_{\epsilon/100}(D) \subseteq C_k$. Therefore, for an input
database $D$ that satisfies (12) (and this is all but an
$O(|X|^{-m})$ fraction of them, by Lemma 5.3) we have

$$\mathrm{Vol}(C_k) \ge \mathrm{Vol}(\mathrm{Good}_{\epsilon/100}(D)) \ge \frac{\mathrm{Vol}(C_0)}{|X|^{2m}}. \qquad (14)$$



Combining inequalities (13) and (14) yields

$$h \le \frac{2 m \ln |X|}{\ln \frac{50}{47}} < 40 m \ln |X|,$$

as claimed. □
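
The numeric constants in this proof are easy to verify directly. The short check below is only a sanity test of the arithmetic: it confirms that $\frac{94}{100} e^{-1/50} > \frac{92}{100}$ and that $2/\ln(50/47) < 40$, and evaluates the resulting bound on the number of hard queries for hypothetical values of $m$ and $|X|$.

```python
import math

def hard_query_bound(m, domain_size):
    """Upper bound on h obtained by combining (13) and (14): (94/100)^h >= |X|^(-2m)."""
    return 2.0 * m * math.log(domain_size) / math.log(50.0 / 47.0)

contraction_lhs = 0.94 * math.exp(-1.0 / 50.0)   # should exceed 92/100
constant = 2.0 / math.log(50.0 / 47.0)           # should be below 40

# Hypothetical parameters, for illustration only.
m, domain_size = 20, 1000
bound = hard_query_bound(m, domain_size)
claimed = 40.0 * m * math.log(domain_size)
```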



Lemmas 4.4, 4.5, and 5.4 give the following utility guarantee.

Theorem 5.5. For every set $f_1, \ldots, f_k$ of queries, for all
but a negligible fraction of fractional histograms $F$, the efficient
implementation of the median mechanism is $(\epsilon, \delta)$-useful
with $\delta = k \exp(-\Omega(\epsilon n \alpha'))$.

5.3 Usefulness for Finite Databases 

Fractional histograms correspond to probability distributions
over $X$. Lemma 5.3 shows that most probability distributions
are 'good' for the efficient implementation of the
median mechanism; in fact, more is true. We next show that
finite databases sampled from randomly selected probability
distributions also have good volume properties. Together,
these lemmas show that the efficient implementation of the
median mechanism will be able to answer nearly exponentially
many queries with high probability, in the setting in
which the private database $D$ is drawn from some 'typical'
population distribution.

DatabaseSample($|D|$):

1. Select a fractional point $F \in C_0$ uniformly at random.

2. Sample and return a database $D$ of size $|D|$ by drawing
each $x \in D$ independently at random from the probability
distribution over $X$ induced by $F$ (i.e., sample
$x_i \in X$ with probability proportional to $F_i$).
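
The two steps of DatabaseSample can be sketched in a few lines. This is a hedged illustration rather than the procedure as implemented by the authors: step 1 uses the standard fact that normalizing i.i.d. exponential variables gives a uniform point on the simplex, which is one way (an assumption of this sketch) to draw a uniform point of $C_0$; points of $X$ are represented by their indices; and the function names and parameters are hypothetical.

```python
import random

def sample_fractional_histogram(domain_size, m, rng):
    """Step 1: a uniform point of C_0 = {F >= 0, ||F||_1 <= m}."""
    # Normalizing domain_size + 1 i.i.d. exponentials gives a uniform point on the
    # standard simplex; dropping the last coordinate gives a uniform point of
    # {F >= 0, sum(F) <= 1}, which is then scaled by m.
    e = [rng.expovariate(1.0) for _ in range(domain_size + 1)]
    total = sum(e)
    return [m * x / total for x in e[:-1]]

def database_sample(domain_size, m, db_size, rng):
    """Step 2: draw db_size points i.i.d. with probability proportional to F_i."""
    F = sample_fractional_histogram(domain_size, m, rng)
    # rng.choices normalizes the weights, so index i is drawn with probability F_i / sum(F).
    return rng.choices(range(domain_size), weights=F, k=db_size)

# Hypothetical toy parameters.
rng = random.Random(0)
F = sample_fractional_histogram(50, 10.0, rng)
D = database_sample(50, 10.0, 20, rng)
```

Only the direction of $F$ matters in step 2, since the sampling probabilities are normalized.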

Lemma 5.6. For $|D|$ as in (1) (as required for the median
mechanism), a database sampled by DatabaseSample($|D|$)
satisfies (12) except with probability at most $O(|X|^{-m})$.



Proof. By Lemma 5.3, except with probability $|X|^{-m}$,
the fractional histogram $F$ selected in step 1 satisfies

$$\frac{\mathrm{Vol}(\mathrm{Good}_{\epsilon/200}(F))}{\mathrm{Vol}(C_0)} \ge \frac{1}{|X|^{2m}}.$$

By Lemma 4.6, when we sample a database $D$ of size
$|D| \ge O((\log |X| \log^3 k \log(1/\epsilon))/\epsilon^3)$ from the probability
distribution induced by $F$, except with probability $\delta =
O(k |X|^{-\log^3 k / \epsilon})$, $\mathrm{Good}_{\epsilon/200}(F) \subseteq \mathrm{Good}_{\epsilon/100}(D)$, which
gives us condition (12). □

We would like an analogue of Lemma 5.3 that holds for
all but a diminishing fraction of finite databases (which correspond
to lattice points within $C_0$) rather than fractional
points in $C_0$, but it is not clear how uniformly randomly
sampled lattice points distribute themselves with respect to
the volume of $C_0$. If $n \gg |X|$, then the lattice will be fine
enough to approximate the volume of $C_0$, and Lemma 5.3
will continue to hold. We now show that small uniformly
sampled databases will also be good for the efficient version
of the median mechanism. Here, small means $n = o(\sqrt{|X|})$,
which allows for databases which are still polynomial in the
size of $X$. A tighter analysis is possible, but we opt instead
to give a simple argument.

Lemma 5.7. For every $n$ such that $n$ satisfies (1) and $n =
o(\sqrt{|X|})$, all but an $O(n^2/|X|)$ fraction of databases $D$ of
size $|D| = n$ satisfy condition (12).

Proof. We proceed by showing that our DatabaseSample
procedure, which we know via Lemma 5.6 generates
databases that satisfy (12) with high probability, is
close to uniform. Note that DatabaseSample first selects
a probability distribution $F$ uniformly at random from the
positive quadrant of the $\ell_1$ ball, and then samples $D$ from
$F$.

For any particular database $D^*$ with $|D^*| = n$ we write
$\Pr_U[D = D^*]$ to denote the probability of generating $D^*$
when we sample a database uniformly at random, and we
write $\Pr_N[D = D^*]$ to denote the probability of generating
$D^*$ when we sample a database according to DatabaseSample.
Let $R$ denote the event that the sampled database contains no
duplicate elements. We begin by noting by symmetry that
$\Pr_U[D = D^* \mid R] = \Pr_N[D = D^* \mid R]$. We first argue that
$\Pr_U[R]$ and $\Pr_N[R]$ are both large. We immediately have
that the expected number of repetitions in database $D$ when
drawn from the uniform distribution is $\binom{n}{2}/|X|$, and so,
by Markov's inequality, $\Pr_U[\neg R] \le \binom{n}{2}/|X| = O(n^2/|X|)$.
We now consider $\Pr_N[R]$. Since $F$ is a uniformly
random point in the positive quadrant of the $\ell_1$ ball,
each coordinate $F_i$ has the marginal of a Beta distribution:
$F_i \sim \beta(1, |X| - 1)$. (See, for example, [Dev86], Chapter 5.)
Therefore, $E[F_i^2] = \frac{2}{|X|(|X|+1)}$, and so the expected number
of repetitions in database $D$ when drawn from DatabaseSample is

$$\binom{n}{2} \sum_{i=1}^{|X|} E[F_i^2] = \frac{2 \binom{n}{2}}{|X| + 1}.$$

Therefore, $\Pr_N[\neg R] \le \frac{2 \binom{n}{2}}{|X| + 1} = O(n^2/|X|)$.



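Finally, let $B$ be the event that database $D$ fails to satisfy
(12). We have:

$$\Pr_U[B] = \Pr_U[B \mid R] \cdot \Pr_U[R] + \Pr_U[B \mid \neg R] \cdot \Pr_U[\neg R]$$

$$= \Pr_N[B \mid R] \cdot \Pr_U[R] + \Pr_U[B \mid \neg R] \cdot \Pr_U[\neg R]$$

$$\le \Pr_N[B \mid R] \cdot \Pr_U[R] + \Pr_U[\neg R]$$

$$\le \frac{\Pr_N[B]}{\Pr_N[R]} \cdot \Pr_U[R] + \Pr_U[\neg R]$$

$$\le \frac{\Pr_N[B]}{1 - O(n^2/|X|)} + O\left(\frac{n^2}{|X|}\right) = O\left(\frac{n^2}{|X|}\right),$$

where the last equality follows from Lemma 5.6, which states
that $\Pr_N[B]$ is negligibly small. □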
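
The two expected repetition counts used in this proof can be computed exactly, which gives a quick check that both are at most $n^2/|X|$ (a bound implied by the formulas, used here only for illustration). The sketch assumes $E[F_i^2] = \frac{2}{|X|(|X|+1)}$ for the Beta$(1, |X|-1)$ marginal; the function names and the sample values of $n$ and $|X|$ are hypothetical.

```python
from math import comb

def expected_repeats_uniform(n, domain_size):
    """Expected number of colliding pairs when n points are drawn uniformly from X."""
    return comb(n, 2) / domain_size

def expected_repeats_database_sample(n, domain_size):
    """Expected number of colliding pairs under DatabaseSample."""
    # E[F_i^2] for a Beta(1, |X| - 1) marginal, summed over all |X| coordinates.
    second_moment = 2.0 / (domain_size * (domain_size + 1))
    return comb(n, 2) * domain_size * second_moment

# Hypothetical sizes with n well below sqrt(|X|).
n, domain_size = 100, 10**6
uniform_repeats = expected_repeats_uniform(n, domain_size)
sampled_repeats = expected_repeats_database_sample(n, domain_size)
markov_bound = n * n / domain_size
```

By Markov's inequality, each expectation upper-bounds the probability of seeing any repetition, so both $\Pr_U[\neg R]$ and $\Pr_N[\neg R]$ vanish when $n = o(\sqrt{|X|})$.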

We observe that we can substitute either of the above lemmas
for Lemma 5.3 in the proof of Lemma 5.4 to obtain versions
of Theorem 5.5.

Corollary 5.8. For every set $f_1, \ldots, f_k$ of queries,
for all but a negligible fraction of databases sampled by
DatabaseSample, the efficient implementation of the median
mechanism is $(\epsilon, \delta)$-useful with $\delta = k \exp(-\Omega(\epsilon n \alpha'))$.

Corollary 5.9. For every set $f_1, \ldots, f_k$ of queries, for
all but an $n^2/|X|$ fraction of uniformly randomly sampled
databases of size $n$, the efficient implementation of the median
mechanism is $(\epsilon, \delta)$-useful with $\delta = k \exp(-\Omega(\epsilon n \alpha'))$.

6. CONCLUSION 

We have shown that in the setting of predicate queries,
interactivity does not pose an information theoretic barrier
to differentially private data release. In particular, our dependence
on the number of queries $k$ nearly matches the
optimal dependence of $\log k$ achieved in the offline setting
by [BLR08]. We remark that our dependence on other parameters
is not necessarily optimal: in particular, [DNR+09]
achieves a better (and optimal) dependence on $\epsilon$. We
have also shown how to implement our mechanism in time
poly$(|X|, k)$, although at the cost of sacrificing worst-case
utility guarantees. Whether there is an interactive mechanism
with poly$(|X|, k)$ runtime and worst-case utility guarantees
remains an interesting open question. More generally,
although the lower bounds of [DNR+09] seem to preclude
mechanisms with run-time poly$(\log |X|)$ from answering a
superlinear number of generic predicate queries, the question
of achieving this runtime for specific query classes of
interest (offline or online) remains largely open. Recently a
representation-dependent impossibility result for the class of
conjunctions was obtained by Ullman and Vadhan [UV10]:
either extending this to a representation-independent impossibility
result, or circumventing it by giving an efficient
mechanism with a novel output representation, would be very
interesting.